Table
The table node is used to connect a single table to a data source node. Unlike the Tables node, many nodes can be connected to a single data source, with each node representing a different table. A range of operations and functions can then be connected to any given table node in order to perform any required data cleansing or manipulation on the data set before loading it into a target.
Table nodes are a good option if your data set contains tables on which you need to perform any kind of data preparation; the alternative to the table node is the tables node, which does not support data cleansing processes in the Data Flow.
Configure Table Nodes
There are two methods of adding Table nodes to the Data Flow:
Method 1
The first method is to select all the required tables from the Properties panel of the data source node, and then click Add Tables (green highlight below):
Each selected table is now represented by its own node:
Method 2
The second method requires you to add each Table node individually, connecting them to the data source node (green arrow below), and to then inject the relevant table (blue arrow) into each Table node from the Properties panel. You can also rename the table from each node (red highlight).
Additional options:
- If the table includes value-based filters, you can select the Ignore value-based filters checkbox (green highlight) to remove them for the new model, making the entire dataset available for further analysis.
- If your data source is an Excel or Google Sheets spreadsheet, you should select the First row has column names checkbox if your source table includes a header row (not shown).
Preview Result Set
The result set of each table can be previewed by clicking on the relevant node, and then clicking the preview icon from the node's Properties panel, or from the Preview panel. By default, the preview is limited to the first 50 rows, but this can be changed from the Preview Size drop down-in the ribbon.
- Click here to learn more about the Preview panel.
Filters
The inline filter allows you to limit the query results by defining a query condition, rather than having to apply a filter after the query has been run. This limits the query results, avoiding filtering after the query has been run. Each query condition is created by selecting a column name and creating a comparison (equal to, greater than, and so on) between it and another column, variable, or value.
Note: The Filters panel is not shown where the data source is one of the file sources (for example, if your data source is Excel).
To create a Query Condition:
- Click plus (+) to add a query condition.
- Set up the first operand for your Query Condition:
- Select the column name to be used as the operand for the filter (red box above).
- Then select the operator (green box), from the options. This can be one of equals, greater than, and so on.
- Next, select the second operand for your Query Condition:
- Select the type (orange box below) for the second operand. This can be one of column, variable, or value.
- Select or enter the value to compare your first operand to.
- Repeat as required, considering:
- Multiple conditions need to be "joined" to prior conditions.
- You can, optionally, group multiple conditions.
A Query Condition row opens.
Second Operand
The second operand selected (blue box below) will depend on the selected comparison type:
- If column is selected, then a drop-down list of all columns will be displayed, allowing you to select a column
- If variable is selected, then a drop-down list of all variables will be displayed, allowing you to select a variable
- If value is selected, then enter the actual value.
Multiple conditions
Join conditions
- If multiple conditions are required, click on the "+" sign (red box below) to add an additional query condition.
- Select AND or OR (yellow box) to determine how the two conditions should be joined.
Group Conditions
If more than two conditions are required, you might want to group conditions together:
- Select the checkboxes of the columns that you want to group together (orange box).
- Click on the group button (red box) to group the columns
Column Selection
Expand the Column Selection panel to update the column selection for the given table. By default, all columns in the table are selected but you can remove columns by clearing their checkboxes. Columns that have been deselected will not be copied to the new data model.
- Select All / Select: Select those columns that you want to copy to the new data model.
- Tooltip: Hover your cursor over the data type icon to see the tooltip describing the column number (starting at 0), the column name, and the data type.
- Data Type mapping: Type or select your required data type using this combo box (green highlight). Updating this value changes the data type for the selected column before it is passed on to the next node or the target. This mapping option is typically only relevant for file sources.
Note: The data type for String(N) indicates a string length "N." The data type for Decimal(P,S) indicates the number of digits "P" and decimal places "S."
Example: You may want to use this mapping if your spreadsheet only includes numeric zip codes. In this situation, the field will be detected as being an integer because of its content, but since zip codes can also include letters, you should use this combo box to change the data type to String(255).
Variables
Expand the Set Variable Values window to pass a variable to the node. To do this, click the plus sign and then select the relevant variable, the aggregation type, and the relevant column.
In this example, the table contains the income column on the left in the image below. We want to create a variable that will be used to load only rows where the income was 75000 and above (as in the column on the right).
To accomplish this, a variable is created named VarIncome, where the value is set to 75000:
This variable is then injected into the table node where the Income column is located:
Finally, the Filter node is connected and used to filter the table by the Income column, using the variable:
Metadata
Expand the Metadata panel and add the following:
- Description: Add a description to this node. This is useful for keeping track of the ETL process, especially if multiple users are working with the same ETL. The description is visible only in the Model app.
- Validate: If you do not want to validate this node when you run the validation process, for example because it is under construction and contains some invalid scripting temporarily, you can clear the selection of the Validate checkbox. This is not recommended.
- Node ID: The unique ID for this node. Click the copy icon to copy this ID to your clipboard.
Edit Table Nodes
To edit a Table node, click on it and go to its Properties panel. Here you can update the node's preview and its various settings.
You can rename the table (red box below), change the table selection (green box), change the filter selection (yellow box) change the column selection (orange box), set incremental loading via a variable, (yellow arrow), and add a description (red arrow) that will be visible only in the Model app.